ARTS: autonomous research topic selection system using word embeddings and network analysis

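Abstract

The materials science research process has become increasingly autonomous due to the remarkable progress in artificial intelligence. However, autonomous research topic selection (ARTS) has not yet been fully explored, owing to the difficulty of estimating the promise of a topic and the lack of previous research. This paper introduces an ARTS system that autonomously selects potential research topics that are likely to reveal new scientific facts yet have not been the subject of much previous research, by analyzing vast numbers of articles. Potential research topics are selected by analyzing the difference between two research concept networks constructed from research information in articles: one represents the promise of research topics and is constructed from word embeddings, and the other represents known facts and past research activities and is constructed from statistical information on the appearance patterns of research concepts. The ARTS system is also equipped with functions to search and visualize information about the selected research topics, to assist in the final determination of a research topic by a scientist. We developed the ARTS system using approximately 100 00 articles published in the Computational Materials Science journal. The results of our evaluation demonstrated that research topics studied after 2016 could be generated autonomously from an analysis of the articles published before 2015. This suggests that potential research topics can be effectively selected using the ARTS system.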
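The network-difference idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy vectors, the cosine-similarity threshold, the co-occurrence count cutoff, and all function names below are assumptions made purely for illustration. One network links concepts whose embeddings are similar, the other links concepts that have already appeared together in an article, and the candidate topics are the links present in the first network but absent from the second.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def embedding_edges(vectors, threshold):
    """Link concept pairs whose embedding similarity reaches the threshold."""
    return {
        frozenset((a, b))
        for a, b in combinations(sorted(vectors), 2)
        if cosine(vectors[a], vectors[b]) >= threshold
    }


def cooccurrence_edges(documents, min_count):
    """Link concept pairs that appear together in at least min_count documents."""
    counts = {}
    for concepts in documents:
        for a, b in combinations(sorted(set(concepts)), 2):
            key = frozenset((a, b))
            counts[key] = counts.get(key, 0) + 1
    return {edge for edge, count in counts.items() if count >= min_count}


def candidate_topics(vectors, documents, threshold=0.8, min_count=1):
    """Return pairs that look related in embedding space but were never co-studied."""
    promising = embedding_edges(vectors, threshold)
    known = cooccurrence_edges(documents, min_count)
    return sorted(tuple(sorted(edge)) for edge in promising - known)


# Hypothetical toy data: two-dimensional "embeddings" and per-article concept lists.
vectors = {
    "graphene": [1.0, 0.1],
    "nanotube": [0.9, 0.2],
    "perovskite": [0.1, 1.0],
    "solar cell": [0.2, 0.9],
}
documents = [
    ["graphene", "nanotube"],
    ["perovskite"],
    ["solar cell", "graphene"],
]

print(candidate_topics(vectors, documents))
```

In this toy setting the pair "perovskite" and "solar cell" is close in embedding space but never co-occurs in a document, so it is the only candidate returned. A real system would need trained embeddings, concept extraction from article text, and a more careful scoring of promise than a single threshold.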

Related articles

A Correlated Topic Model Using Word Embeddings

Conventional correlated topic models are able to capture correlation structure among latent topics by replacing the Dirichlet prior with the logistic normal distribution. Word embeddings have been proven to be able to capture semantic regularities in language. Therefore, the semantic relatedness and correlations between words can be directly calculated in the word embedding space, for example, ...

Topic Modeling Using Distributed Word Embeddings

We propose a new algorithm for topic modeling, Vec2Topic, that identifies the main topics in a corpus using semantic information captured via high-dimensional distributed word embeddings. Our technique is unsupervised and generates a list of topics ranked with respect to importance. We find that it works better than existing topic modeling techniques such as Latent Dirichlet Allocation for iden...

Topic Modelling with Word Embeddings

This work aims at evaluating and comparing two different frameworks for the unsupervised topic modelling of the CompWHoB Corpus, namely our political-linguistic dataset. The first approach is represented by the application of Latent Dirichlet Allocation (henceforth LDA), with the evaluation of this model serving as the baseline for comparison. The second framework employs Word2Vec technique...

Enhancing Feature Selection Using Word Embeddings

Health surveillance systems based on online user-generated content often rely on the identification of textual markers that are related to a target disease. Given the high volume of available data, these systems benefit from an automatic feature selection process. This is accomplished either by applying statistical learning techniques, which do not consider the semantic relationship between the...

Topic Sentiment Joint Model with Word Embeddings

The topic sentiment joint model is an extended model that aims to detect sentiments and topics simultaneously from online reviews. Most existing topic sentiment joint modeling algorithms infer the resulting distributions from the co-occurrence of words. But when the training corpus is short and small, the resulting distributions might not be very satisfying. In this pape...

Journal

Journal title: Machine Learning: Science and Technology

Year: 2022

ISSN: 2632-2153

DOI: https://doi.org/10.1088/2632-2153/ac61eb